A Probabilistic Parser

نویسندگان

  • Roger Garside
  • Fanny Leech
چکیده

The UCREL team at the University of Lancaster is engaged in the development of a robust parsing mechanism, which will assign the appropriate grammatical structure to sentences in unconstrained English text. The techniques used involve the calculation of probabilities for competing structures, and are based on the techniques successfully used in tagging (i.e. assigning grammatical word classes) to the LOB (Lancaster-Oslo/Bergen) corpus. The first step in the parsing process involves dictionary lookup of successive pairs of grammatically tagged words, to give a number of possible continuations to the current parse. Since this lookup will often not be able unambiguously to distinguish the point at which a grammatical constituent should be closed, the second step of the parsing process will have to insert closures and distinguish between alternative parses. It will generate trees representing these possible alternatives, insert closure points for the constituents, and compute a probability for each parse tree from the probability of each constituent within the tree. It will then be able to select a preferred parse or parses for output. The probability of a grammatical constituent is derived from a bank of manually parsed sentences.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Studying impressive parameters on the performance of Persian probabilistic context free grammar parser

In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...

متن کامل

A Probabilistic Earley Parser as a Psycholinguistic Model

In human sentence processing, cognitive load can be defined many ways. This report considers a definition of cognitive load in terms of the total probability of structural options that have been disconfirmed at some point in a sentence: the surprisal of word wi given its prefix w0...i−1 on a phrase-structural language model. These loads can be efficiently calculated using a probabilistic Earley...

متن کامل

A Neural Probabilistic Structured-Prediction Model for Transition-Based Dependency Parsing

Neural probabilistic parsers are attractive for their capability of automatic feature combination and small data sizes. A transition-based greedy neural parser has given better accuracies over its linear counterpart. We propose a neural probabilistic structured-prediction model for transition-based dependency parsing, which integrates search and learning. Beam search is used for decoding, and c...

متن کامل

Feature Engineering in Persian Dependency Parser

Dependency parser is one of the most important fundamental tools in the natural language processing, which extracts structure of sentences and determines the relations between words based on the dependency grammar. The dependency parser is proper for free order languages, such as Persian. In this paper, data-driven dependency parser has been developed with the help of phrase-structure parser fo...

متن کامل

Question Identification Using a Probabilistic Context Free Grammar

This paper shows that using the tree structure generated from a Probabilistic Context Free Grammar parser adds meaningful information to language processing tasks, in particular, question identification. By using a part-of-speech representation of a sentence as a base line, this paper’s results show that adding features derived from the tree output of a Probabilistic Context Free Grammar parser...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1985